An LDA-lexical Syntactical Approach for Events and Features Extraction of Earthquakes from Spanish and English Tweets

Authors

  • Enrique Valeriano Loli
  • Juanjosé Tenorio Peña
  • Rodrigo López Condori
Abstract

In recent years, social networks such as Twitter have become a useful resource for tracking events before, during, and after an earthquake. Several studies have applied techniques such as clustering or temporal models to extract these events from Twitter. In this paper, we propose a new approach that extracts not only the events that occurred during the earthquake but also some of its most prominent features, such as intensity, epicenter, and affected places. We perform a lexical-syntactic analysis of Spanish and English tweets to find the events, together with a semantic analysis that uses statistical metrics and models such as Pointwise Mutual Information (PMI) and Latent Dirichlet Allocation (LDA) to extract the features of the earthquake. Our results show that, by considering both the semantics and the syntax of the tweets, we can extract important events and features of an earthquake, which can be used for online detection and tracking of similar disasters.
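
As a rough illustration of the PMI statistic named in the abstract, the sketch below scores word pairs by how often they appear in the same tweet, using PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The function name `pmi_scores`, the whitespace tokenization, and the toy tweets are assumptions made for illustration only; the abstract does not describe the authors' actual pipeline.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(tweets):
    """Return {(w1, w2): PMI in bits} from tweet-level co-occurrence counts."""
    n = len(tweets)
    word_df = Counter()  # number of tweets containing each word
    pair_df = Counter()  # number of tweets containing both words of a pair
    for text in tweets:
        # Deduplicate and sort so each pair has one canonical key order.
        words = sorted(set(text.lower().split()))
        word_df.update(words)
        pair_df.update(combinations(words, 2))
    scores = {}
    for (w1, w2), joint in pair_df.items():
        p_joint = joint / n
        p1, p2 = word_df[w1] / n, word_df[w2] / n
        scores[(w1, w2)] = math.log2(p_joint / (p1 * p2))
    return scores


# Hypothetical toy tweets, not data from the paper.
tweets = [
    "earthquake epicenter lima",
    "earthquake magnitude strong",
    "epicenter lima coast",
    "traffic jam downtown",
]
scores = pmi_scores(tweets)
```

In this toy input, "epicenter" and "lima" always appear together, so `scores[("epicenter", "lima")]` is 1.0 bit, while "earthquake" and "epicenter" co-occur exactly as often as chance would predict and score 0.0. Pairs that never share a tweet are simply absent from the result.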

Similar resources

A Model for Detecting of Persian Rumors based on the Analysis of Contextual Features in the Content of Social Networks

A rumor is a collective attempt to interpret a vague but attractive situation by using the power of words. Therefore, identifying the language of rumors can be helpful in detecting them. Previous research has focused more on the contextual information of reply tweets and less on the content features of the original rumor to address the rumor detection problem. Most of the studies have been in...

2016 Olympic Games on Twitter: Sentiment Analysis of Sports Fans Tweets using Big Data Framework

Big data analytics is one of the most important subjects in computer science. Today, due to the increasing expansion of Web technology, a large amount of data is available to researchers. Extracting information from these data is one of the requirements for many organizations and business centers. In recent years, the massive amount of Twitter's social networking data has become a platform for ...

Cross-Language Domain Adaptation

Rapid crisis response requires real-time analysis of messages. After a disaster happens, volunteers attempt to classify tweets to determine needs, e.g., supplies, infrastructure damage, etc. Given labeled data, supervised machine learning can help classify these messages. Scarcity of labeled data causes poor performance in machine training. Can we reuse old tweets to train classifiers? How can ...

Codeswitching Detection via Lexical Features in Conditional Random Fields

Half of the world’s population is estimated to be at least bilingual. Due to this fact many people use multiple languages interchangeably for effective communication. At the Second Workshop on Computational Approaches to Code Switching, we are presented with a task to label codeswitched, Spanish-English (ES-EN) and Modern Standard Arabic-Dialect Arabic (MSA-DA), tweets. We built a Conditional R...

Demonstration of Multi Statutory of the Adjective “Just” in Modern Adjectival English Lexicon

This article concerns the general functional features of an adjective in modern English, and in particular multistate lexical item “just”, which carries different meanings in different variants of combinatorics. The authors analyze the combinations used with the adjectival lexeme item “just” and reveal the categories that determine the degree of semantic content of each given statement. The nee...

Publication date: 2017